enable optimizer in tests and procmacros #5155
base: master
Conversation
@willhickey @yihau please take a look

Force-pushed from cebe934 to 3a892cc (Compare)
@@ -1830,7 +1823,6 @@ pub mod tests {
        {
            let executable_bool: bool = account.executable();
            assert!(!executable_bool);
Removed this assert as per advice from @roryharr
More context: this assert was added a long time ago by @ryoqun just to monitor some undefined behaviour.
The executable bit is a boolean which uses a byte of storage. The only valid values for that byte are zero and one. This is verified during account load by the function sanitize_executable.
This test corrupts the byte and verifies that sanitize_executable fails, which it does. Afterwards it has various asserts to see how the code behaves in the event the sanitize check was not present. Per @ryoqun's comments, these are purely for his interest.
When compiling with optimizations on, the compiler's behaviour changes when accessing the byte by value: `let byte: u8 = unsafe { std::mem::transmute::<bool, u8>(executable_bool) };` returns the raw byte, rather than the underlying data converted to a bool and then converted to a u8.
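To make the distinction concrete, here is a minimal sketch (variable names are illustrative; this is not the PR's actual test code): for a valid bool, the transmute and the plain `as` cast agree, and the undefined behaviour only appears once the stored byte has been corrupted to a value other than 0 or 1.

```rust
// Illustrative sketch, not the actual test code.
fn main() {
    // A bool occupies exactly one byte of storage.
    assert_eq!(std::mem::size_of::<bool>(), 1);

    let executable_bool: bool = true;

    // bool -> u8 via `as` is always well-defined.
    let via_as: u8 = executable_bool as u8;

    // The transmute from the test goes in the same direction, so it is
    // equally well-defined for the valid values 0 and 1...
    let via_transmute: u8 =
        unsafe { std::mem::transmute::<bool, u8>(executable_bool) };
    assert_eq!(via_as, 1);
    assert_eq!(via_transmute, 1);

    // ...but if the stored byte had been corrupted to, say, 2 before
    // being read as a bool, that read would already be undefined
    // behaviour, and the optimizer is free to produce any result.
    // That case is intentionally not reproduced here.
}
```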
@ryoqun In case you are still interested
@@ -1277,13 +1277,6 @@ pub mod tests {
    }
}

fn get_executable_byte(&self) -> u8 {
this function was only used in a test
Cargo.toml (Outdated)

# Enable basic optimizations for unittests
[profile.test]
opt-level = 1
One of the reasons unit tests do not have optimization enabled is to speed up compilation for interactive development.
When working on a single unit test, you want the fastest edit/compile/run cycle.
Here is an example of a command I would run when working on a specific unit test:
cargo test \
--package solana-core \
--test epoch_accounts_hash \
--all-features -- \
--exact --nocapture test_snapshots_have_expected_epoch_accounts_hash
Have you considered the increase in compilation time when switching from opt-level 0 to opt-level 1 for these use cases?
I think a common approach for this kind of CI speed improvement is to build unit tests with optimizations in CI only.
cargo supports custom profiles, so it might be possible to create a custom profile, used only by CI, that enables a higher optimization level.
More details on custom profiles can be found here:
https://doc.rust-lang.org/cargo/reference/profiles.html#custom-profiles
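A sketch of what such a setup could look like (the profile name is illustrative; this is a suggestion, not something the PR contains): a custom profile in the workspace Cargo.toml that inherits from `test` but raises the optimization level, which CI then selects explicitly.

```toml
# Hypothetical custom profile for CI. Local `cargo test` keeps the
# default unoptimized `test` profile for fast edit/compile/run cycles.
[profile.ci-test]
inherits = "test"
opt-level = 1
```

CI would then opt in with something like `cargo test --profile ci-test` (custom profiles and the `--profile` flag are stable since Rust 1.57).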
o1 is in many cases practically free in terms of compile times. For example, running `time cargo test --package solana-core --test epoch_accounts_hash`:

Clean build with optimizations (o1):
Executed in 177.98 secs    fish    external
   usr time  76.21 mins    0.00 micros  76.21 mins
   sys time  10.00 mins  839.00 micros  10.00 mins

Clean build with no optimizations (default settings):
Executed in 159.20 secs    fish    external
   usr time  40.46 mins  837.00 micros  40.46 mins
   sys time   8.29 mins    0.00 micros   8.29 mins
So, we would save 15 seconds of real build time (on full rebuild), while spending far longer waiting for the actual tests to run (assuming you are running more than one at a time). Some of our tests already take minutes to complete, especially the ones in local-cluster, and we only need to build stuff once to run all the tests. Full CI runs are almost always > 1 hour. o1 gives about 10x boost in speed compared to o0.
Another aspect here is that with optimizer disabled certain kinds of bugs such as race conditions may just never show up. Essentially we end up testing something other than what we run in prod in terms of actual generated assembly, and the differences can be quite dramatic.
> o1 is in many cases practically free in terms of compile times. For example, running `time cargo test --package solana-core --test epoch_accounts_hash`:
>
> Clean build with optimizations (o1): Executed in 177.98 secs (fish, external); usr time 76.21 mins; sys time 10.00 mins
>
> Clean build with no optimizations (default settings): Executed in 159.20 secs (fish, external); usr time 40.46 mins; sys time 8.29 mins
>
> So, we would save 15 seconds of real build time (on full rebuild), while spending far longer waiting for the actual tests to run (assuming you are running more than one at a time).
When working on a single test, I would be running just this one test.
And if the compilation is 15 seconds slower, most of those 15 seconds will be added to my edit/compile/run cycle. The speed-up in a single test execution is normally unnoticeable.
Are you using a multithreaded linker, such as mold?
> Some of our tests already take minutes to complete, especially the ones in local-cluster
local-cluster is the most extreme example. And it is not a unit test, but an integration test.
This is why it is a separate crate, which can have optimization configured for it separately.
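For illustration, a per-package override could raise the optimization level for just one heavyweight test crate while leaving everything else unoptimized (the package name here is an assumption about the repo layout, not taken from the PR):

```toml
# Hypothetical: optimize only one slow integration-test crate,
# leaving the default test profile at opt-level 0 elsewhere.
[profile.test.package.solana-local-cluster]
opt-level = 1
```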
> we only need to build stuff once to run all the tests. Full CI runs are almost always > 1 hour. o1 gives about 10x boost in speed compared to o0.
This is the CI use case.
And I think speeding up our CI is a great cause.
But for the developer use case, you normally compile one crate and run one test.
And if unit tests are small and fast, you do not benefit from the speed-up, but you do suffer from the increased compilation time.
> Another aspect here is that with optimizer disabled certain kinds of bugs such as race conditions may just never show up. Essentially we end up testing something other than what we run in prod in terms of actual generated assembly, and the differences can be quite dramatic.
While it is great to be able to find UB issues, I am not sure this is a good argument here.
I do not have comprehensive statistics, but my guess is that, depending on the UB, you may or may not observe it at different optimization levels.
So it could be the switch between them that is fruitful.
Or you could miss it in either case.
You cannot really rely on unit tests to catch undefined behavior.
15 sec is for a full rebuild after cargo clean. While incrementally building one test, the difference is not really measurable.
@@ -1277,13 +1277,6 @@ pub mod tests {
    }
}

fn get_executable_byte(&self) -> u8 {
This change seems unrelated to the unit tests speed up.
Why not move it into a separate PR?
The only reason this change is happening is to make it possible to enable optimizations for unit tests. For more substantial changes, separate PRs have been made by the teams handling the respective pieces of code. I'll be making some as well.
If this check is not really useful (as is claimed in the discussion), removing it can be done in a separate PR, as it has value by itself: making things simpler and avoiding UB, IIUC.
And merging smaller PRs is generally easier.
For example, should you end up introducing a separate compilation profile for CI, this PR will increase in size and take some time to land.
But this particular change might go in pretty quickly on its own.
It is up to you, of course. I was just speaking from personal experience :)
Problem
Summary of Changes